Speaker adaptation using tree structured shared-state HMMs

Authors

  • Jun Ishii
  • Masahiro Tonomura
  • Shoichi Matsunaga
Abstract

This paper proposes a novel speaker adaptation method that flexibly controls state-sharing of HMMs according to the amount of adaptation data. In our scheme, acoustic modeling is combined with adaptation to efficiently utilize the acoustic models' sharing characteristics for adaptation. The shared-state set of HMMs is determined by using tree-structured shared-state HMMs created from the history recorded for acoustic model generation. The proposed method is applied to the parameter-tying and parameter-smoothing techniques. Experiments have been performed on a Japanese phoneme recognition test using continuous density mixture Gaussian HMMs. Using 50 adaptation phrases, a 42% reduction in the phoneme recognition error rate from the speaker-independent model was achieved.
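
The idea of letting the amount of adaptation data decide how coarsely states are shared can be illustrated with a small sketch. This is not the authors' algorithm: the tree, the `Node` class, the `min_frames` threshold, and the frame counts below are all hypothetical, chosen only to show one plausible way a tree can select a sharing level. Each leaf state climbs the tree until it reaches a node whose subtree has seen enough adaptation frames, and every state that stops at the same node would share one adaptation estimate.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of a hypothetical state-sharing tree (not from the paper)."""
    name: str
    frames: int = 0  # adaptation frames observed in this node's subtree
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def accumulate(node: Node) -> int:
    """Propagate leaf frame counts upward so each node holds its subtree total."""
    if node.children:
        node.frames = sum(accumulate(c) for c in node.children)
    return node.frames


def tying_node(leaf: Node, min_frames: int) -> Node:
    """Return the lowest node on the path to the root with enough data.

    Falls back to the root when no node reaches the threshold.
    """
    node = leaf
    while node.frames < min_frames and node.parent is not None:
        node = node.parent
    return node


def build_example() -> Dict[str, Node]:
    """Build a toy two-level tree; all frame counts are made up."""
    root = Node("root")
    vowels = root.add(Node("vowels"))
    consonants = root.add(Node("consonants"))
    leaf_nodes = {
        "a": vowels.add(Node("a", frames=120)),
        "i": vowels.add(Node("i", frames=10)),
        "k": consonants.add(Node("k", frames=5)),
        "s": consonants.add(Node("s", frames=8)),
    }
    accumulate(root)
    return leaf_nodes


def sharing(min_frames: int) -> Dict[str, str]:
    """Map each leaf state to the name of the node it shares parameters with."""
    return {name: tying_node(leaf, min_frames).name
            for name, leaf in build_example().items()}


if __name__ == "__main__":
    # With a threshold of 50 frames, sparsely observed states share widely:
    # {'a': 'a', 'i': 'vowels', 'k': 'root', 's': 'root'}
    print(sharing(50))
```

Lowering the hypothetical threshold (or, equivalently, collecting more data) pushes every state back toward its own leaf, which is the qualitative behavior the abstract describes: sharing becomes finer as adaptation data grows.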

Similar resources

Online Bayesian tree-structured transformation of HMMs with optimal model selection for speaker adaptation

This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform or adapt a set of hidden Markov model (HMM) parameters for a new speaker and gain large performance improvement from a small amount of adaptation data. By constructing a clustering tree of HMM Gaussian mixture components, the linear...

Cluster adaptive training with factorized decision trees for speech recognition

Cluster adaptive training (CAT) is a popular approach to train multiple-cluster HMMs for fast speaker adaptation in speech recognition. Traditionally, a cluster-independent decision tree is shared among all clusters, which could limit the modelling power of multiple-cluster HMMs. In this paper, each cluster is allowed to have its own decision tree. The intersections between the triphones subset...

Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognition

To recognize non-native speech, larger acoustic/linguistic distortions must be handled adequately in acoustic modeling, language modeling, lexical modeling, and/or decoding strategy. In this paper, a novel method to enhance MLLR adaptation of acoustic models for non-native speech recognition is proposed. In the case of native speech recognition, MLLR speaker adaptation was successfully introduc...

High-speed speaker adaptation using phoneme dependent tree-structured speaker clustering

Tree-structured speaker clustering was proposed as a high-speed speaker adaptation method. It can select the model which is most similar to a target speaker. However, this method does not consider speaker difference dependent on phoneme class. In this paper, we propose a speaker adaptation method based on speaker clustering by taking speaker difference dependent on phoneme class into account...

Incremental on-line speaker adaptation in adverse conditions

In this paper, we examine the use of speaker adaptation in adverse noise conditions. In particular, we focus on incremental on-line speaker adaptation since it, in addition to its other advantages, enables joint speaker and environment adaptation. First, we show that on-line adaptation is superior to off-line adaptation when realistic changing noise conditions are considered. Next, we show that...

Publication date: 1996